An Analysis of Societal Bias in Sota NLP Transfer Learning | PyData Global 2021

Duration: 30:30

Views: 328

Likes: 7

Date Created: Jan, 2022

Channel: PyData

Tags: python, learn to code, education, software, pydata, learn coding, how to program, julia, opensource, scientific programming, numfocus, python 3, tutorial

Description: An Analysis of Societal Bias in Sota NLP Transfer Learning

Speakers: Benjamin Ajayi-Obe, David Hopes

Summary

The popularisation of large pre-trained language models has resulted in their increased adoption in commercial settings. However, these models are usually pre-trained on raw, unprocessed corpora that are known to contain a plethora of societal biases. In this talk, we explore the sources of this bias, as well as recent methods of measuring and mitigating it.

Description

Since the publication of Google’s seminal paper, “Attention is all you need”, attention-based transformers have become widely celebrated and adopted for their impressive ability to emulate human-like text. However, it has become increasingly evident that, while these models are very capable of modelling text from a large corpus, they also embed societal biases present in the data. These biases can be difficult to detect unless they are intentionally inspected for or documented, so they pose a real risk to organisations that wish to make use of state-of-the-art NLP models, particularly those with limited budgets to retrain them.

This talk is for anyone who wishes to deepen their understanding of attention-based transformers from an ethical standpoint, and for those looking to deploy attention-based models in a commercial setting. You will leave with a better understanding of the types of bias that pose a risk to attention-based models, the sources of this bias and potential strategies for mitigating it.

For this talk we presume the audience has a high-level understanding of neural networks and some knowledge of linear algebra. The first 15 minutes will be a discussion of the types of bias that pose a risk to these models, as well as some demonstrations of biased outputs. The second 15 minutes will be an exploration of strategies to detect and mitigate these biases.

Benjamin Ajayi-Obe's Bio

I am a data scientist in the ranking and recommendation team of Depop. I am interested in the development and application of NLP models for commercial use. I am also interested in the ethical implications of deploying AI solutions in the real world and exploring ways of ensuring fairness, equity and safety in a society that is increasingly adopting ML.

GitHub: github.com/BenAjayiObe
Twitter: twitter.com/BOA4
LinkedIn: linkedin.com/in/benjamin-ajayi-obe-14183073

David Hopes's Bio

Data scientist and ethical AI ambassador at Depop, currently focusing on ML solutions for the marketing technology team. Research interests in computational pragmatics and context in communication.

GitHub: github.com/davidhopes
Twitter: twitter.com/davidghopes
LinkedIn: linkedin.com/in/davidhopes

PyData Global 2021

Website: pydata.org/global2021
LinkedIn: linkedin.com/company/pydata-global
Twitter: twitter.com/PyData

pydata.org

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: github.com/numfocus/YouTubeVideoTimestamps